Topic Model Based Media Program Genome Generation For A Video Delivery System

ABSTRACT

In one embodiment, the method incorporates a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model. Then, the method incorporates a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model. The model is trained with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs. The method further scores terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.

CROSS REFERENCES TO RELATED APPLICATIONS

The present application is a continuation of U.S. application Ser. No. 14/157,800, filed Jan. 17, 2014, entitled “TOPIC MODEL BASED MEDIA PROGRAM GENOME GENERATION”, which is incorporated by reference herein in its entirety.

BACKGROUND

A media program genome includes words or phrases that reflect salient characteristics of a media program, such as a television show or movie. The genome may not be just a tag or a keyword because the genome is well-defined and synonyms have been removed.

The genome can reflect different aspects of characteristics. For example, the aspects may be divided into factual aspects and semantic aspects. Factual aspects may not be very relevant to the storyline of the media program, but rather present factual information related to the media program, such as the actor, director, studio, language, awards, production country, release year, source, and so on. The semantic aspects may be more relevant to the storyline, such as comments, genre, mood, story time period, location, plot, property, score, and so on.

For each aspect, a company may define some words or phrases as the genome. For example, for the genre aspect, the words or phrases Action, Animation, Drama, Comedy, Western, and Documentary are assigned to the genre aspect. Each word may be a genome, and within each genome, terms may be assigned to the genome, such as Action being associated with the terms “action” and “feats”. The process of assigning words for the genome may be a manual process and subject to the subjective views of the assigner.

SUMMARY

In one embodiment, the method incorporates a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model. Then, the method incorporates a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model. The model is trained with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs. The method further scores terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.

In one embodiment, a non-transitory computer-readable storage medium contains instructions, that when executed, control a computer system to be configured for: incorporating a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model; incorporating a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model; training the model with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs; and scoring terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.

In one embodiment, an apparatus includes: one or more computer processors; and a non-transitory computer-readable storage medium comprising instructions, that when executed, control the one or more computer processors to be configured for: incorporating a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model; incorporating a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model; training the model with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs; and scoring terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.

The following detailed description and accompanying drawings provide a better understanding of the nature and advantages of particular embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a simplified system for generating a genome according to one embodiment.

FIG. 2 depicts a simplified flowchart of a method for training the topic model according to one embodiment.

FIG. 3 shows an example of a trained model output according to one embodiment.

FIG. 4 depicts a simplified flowchart of a method for calculating a joint probability distribution of media programs and the genome according to one embodiment.

FIG. 5A shows an example of generating the genome according to one embodiment.

FIG. 5B shows a chart of an example of the joint distribution of genomes and shows according to one embodiment.

FIG. 6 depicts a video streaming system in communication with multiple client devices via one or more communication networks according to one embodiment.

FIG. 7 depicts a diagrammatic view of an apparatus for viewing video content and advertisements.

DETAILED DESCRIPTION

Described herein are techniques for a genome generation system. In the following description, for purposes of explanation, numerous examples and specific details are set forth in order to provide a thorough understanding of particular embodiments. Particular embodiments as defined by the claims may include some or all of the features in these examples alone or in combination with other features described below, and may further include modifications and equivalents of the features and concepts described herein.

Particular embodiments use a constrained topic model to generate a genome for media programs. The system uses a topic model that determines a probability distribution of terms for topics found in the media programs. Then, the system determines a number of topics for each media program based on the trained topic model. For example, the system uses a term frequency for terms found in the media programs to rank the topics for each media program. The top-ranked topics are then considered the genomes for each media program.

System Overview

FIG. 1 depicts a simplified system 100 for generating genomes according to one embodiment. System 100 includes a topic model trainer 102 and a genome determiner 104. Topic model trainer 102 may receive information for media programs and domain knowledge (e.g., keywords for genomes), and output a trained topic model. The media program information may be textual information about various media programs, which may include storylines, captions, comments about the media program, a synopsis of the media program, or other relevant information for the media program. In one example, the media program is a video and the textual information is a synopsis of the storyline, comments from users, a transcript of the dialogue, etc. Topic model trainer 102 receives the textual information and genome keywords, and trains the topic model.

Genome determiner 104 receives the trained topic model and outputs a genome-media program joint distribution. As discussed above, the genome may include words or phrases that reflect characteristics of the media program. In one embodiment, one genome for a media program is expressed as one topic determined by the trained topic model. The media program may also be associated with multiple genomes. The trained topic model may output a probability distribution for terms (e.g., a word or phrase) found in the topic. For example, the topic of “Action” may be associated with the words action, feats, fight, violence, chase, struggle, etc. The topic model includes a probability distribution for the words in the topic. The probability distribution assigns a probability for each word or term in the topic. As discussed above, the genome may be a word or a phrase. The genome's concept can be expressed by the terms in its corresponding topic. Genome determiner 104 then takes a term frequency (e.g., a number of times a word or phrase occurs in the media program) in each media program and applies the term frequency to the trained topic model. This scores or ranks the topics with respect to each media program. For example, if a media program includes a term that is used a lot, and that term is mainly in the Action topic, then the Action topic may be highly rated. Then, genome determiner 104 determines the top-rated topics for each media program based on applying the term frequency to the trained topic model. The topics can be related to the genomes for the media program. Thus, the top-rated topics are the genome for the media program. In one embodiment, genome determiner 104 determines a joint probability distribution function of media programs and genomes as will be described in more detail below. For the joint probability function, given at least two random variables X, Y, . . . , that are defined on a probability space, the joint probability distribution for X, Y, . . . is a probability distribution that gives the probability that each of X, Y, . . . falls in any particular range or discrete set of values specified for that variable. In this case, the random variables may be the media program and genome.
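
As an illustrative sketch (not part of the claimed embodiments), the scoring performed by genome determiner 104 may be approximated as follows, assuming the trained model is available as a simple topic-to-term-probability mapping. The Action probabilities are taken from the example of FIG. 3; the Adventure terms, the function name, and the sample text are hypothetical:

```python
from collections import Counter

# Hypothetical trained model output: topic -> {term: probability}.
# Action values follow the FIG. 3 example; Adventure values are made up.
topic_term_probs = {
    "Action": {"action": 0.23, "feats": 0.16, "fight": 0.15},
    "Adventure": {"adventurous": 0.21, "quest": 0.12, "journey": 0.10},
}

def score_topics(program_text):
    """Score each topic for one media program by weighting the topic's
    term probabilities with the program's term frequencies."""
    term_freq = Counter(program_text.lower().split())
    return {
        topic: sum(term_freq[term] * prob for term, prob in terms.items())
        for topic, terms in topic_term_probs.items()
    }

# A program whose text repeats Action terms rates the Action topic highest.
scores = score_topics("feats fight action action feats chase")
top_topics = sorted(scores, key=scores.get, reverse=True)
```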

To train the topic model, a corpus may be input into the model. The corpus may be several documents. In one embodiment, each document D may correspond to a media program. For example, a description of the media program includes textual information that may be a word combination. Each word W may be considered an element of the document. Also, each media program may have a different description; for example, a movie #1 has a different description of actors and plot than a movie #2. Topic model trainer 102 then trains the topic model to determine topics T for the corpus. The trained topic model extracts latent topic structures of the corpus. The topics are latent in that they are not explicitly defined in the documents. For example, topic model trainer 102 makes use of word co-occurrence in each document to train the model. The trained topic model may be viewed as a soft clustering of documents into several topics. A document may also spread to more than one cluster. For example, a document contains several words and the document's content may contain multiple topics. A topic expresses the meaning of the document, and a document may have more than one topic. Also, a topic is composed of some related words. That is, a topic is a probability distribution over the unique words in the corpus.

To generate the genomes for each media program, certain requirements and assumptions are used. For example, a media program corresponds to a document and genomes correspond to specific topics. The media program is then a probability distribution of topics. Particular embodiments specify certain information such that the topic model may be used to generate the genomes. For example, in the genome generation, topic model trainer 102 may specify the genome-topic correspondence, specify which words should or should not have a high probability in a specific topic, and prevent small topics from being subsumed by large topics. The training of the topic model combines the three requirements above, which can be used to generate the genomes for the media programs. In training the topic model, topic model trainer 102 may receive a specification of words that are included in a topic. Also, topic model trainer 102 may receive a specification of words that must co-occur in the same topic or cannot occur in the same topic. This uses domain knowledge to generate the genome. Further, the topic size is adaptive to the corpus. For example, a small topic may correspond to a small concept, such as a high school volleyball championship. The small topic may contain only 10 words with 30 occurrences in the corpus in total. Another very big topic may be the World Cup of soccer, which contains more than 100 words with more than 1000 occurrences in the corpus in total. If different topics are assumed to have similar sizes, this assumption forces words from large concepts to leak into small concepts. This is the phenomenon of small topics being submerged by large topics. However, in particular embodiments, this restriction can be removed by introducing more relaxation into the topic model as described below.

The above allows topic model trainer 102 to output a trained topic model that can be used to determine the genomes for media programs. For example, the trained topic model provides a probability distribution of terms for topics. Then, genome determiner 104 may determine the highest-ranked topics for each media program. For example, based on the term frequency found in each document associated with a media program, genome determiner 104 may score each term in the topics based on term frequency for each media program. Then, genome determiner 104 may normalize the result and take the top number of topics based on the scores for each media program. The topics associated with each media program constitute the genomes for the media program. In one embodiment, the scores between topics and the media programs provide a joint distribution of media programs and genomes.

The output of genome determiner 104 may provide the joint probability distribution that a genome belongs to the media program. This joint probability distribution may be used to provide various services. For example, the joint probability distribution that the genome applies to the media program may be used in personalization, recommendation, and search. For example, if a company providing a service for sending media programs to users would like to have a group or tray on the website associated with the genre Action, the company may determine the top ten videos that have the highest probability of having the genome Action. These videos are displayed in the tray. Using the genome, the videos displayed may have the highest probability of being relevant to the genre Action.
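
As an illustrative sketch of this use, the tray selection may read the joint distribution directly. The Action values and the 0.68 Adventure value are taken from the example of FIG. 5B discussed later; the remaining Adventure values and the function name are hypothetical:

```python
# Joint distribution: genome -> {video: joint probability}.
# Action row and the 0.68 value follow FIG. 5B; other Adventure
# values are made up for illustration.
joint = {
    "Action":    {"Video #1": 0.56, "Video #2": 0.03, "Video #N": 0.26},
    "Adventure": {"Video #1": 0.04, "Video #2": 0.11, "Video #N": 0.68},
}

def tray_for_genome(genome, n=10):
    """Return the n videos most likely to carry the given genome,
    e.g., to populate an 'Action' tray on the website."""
    videos = joint[genome]
    return sorted(videos, key=videos.get, reverse=True)[:n]

print(tray_for_genome("Action", n=2))  # ['Video #1', 'Video #N']
```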

Model Training

FIG. 2 depicts a simplified flowchart 200 of a method for training the topic model according to one embodiment. At 202, topic model trainer 102 receives the keywords for each genome. For example, topic model trainer 102 may receive input from a user that specifies keywords that are related to each genome. The keywords may be determined based on domain knowledge, which may be knowledge a company gleans from providing the company's service for sending videos or media programs to users. For example, the keyword action may be related to the genre Action, the keyword adventurous may be related to the genre Adventure, etc. These words are initial terms for the topic corresponding to the genome. Setting some initial terms helps the topic model find all the terms related to the genome in the topic. Further, using domain knowledge to specify the keywords may provide more accurate training of the topic model for the purpose desired. That is, the trained topic model finds terms that are considered more related to the topic based on the specified terms due to the use of domain knowledge.

At 204, topic model trainer 102 relates the words for each genome to a topic. This relates a genome to a topic. For example, the words defined in 202 for each genome are associated with a topic. In one embodiment, topic model trainer 102 may receive input from a user that relates the words of each genome to a topic. The relationship may be determined based on domain knowledge, such as when a company determines the topics, and also the relationship of topics to genomes, based on its knowledge of sending videos to users.

At 206, topic model trainer 102 specifies a number of topics T and a number of genomes L. In one embodiment, the topic number T should be larger than the genome number L. This means that there may be more topics found than genomes that are defined for all media programs. That is, even though a probability distribution of 50 topics is associated with a video, only 30 of the topics may be used in the genome for the video. This is because not all words that occur in the textual information are related to the defined genomes; common words or functional words are one example. Additional topics may act as a place to carry other words.

At 208, topic model trainer 102 trains the topic model. For example, topic model trainer 102 receives the textual information for media programs as input to the topic model. Topic model trainer 102 may find new topics and also use the specified topics. The output from the topic model is a trained topic model that includes a probability distribution of terms for topics found in the textual information. The probability distribution of terms in the topics includes a word distribution for each topic. One word may occur in several topics since it may carry different meanings, such as ‘apple’.
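
The inputs gathered in 202-206 can be summarized in a small configuration sketch. This is illustrative only: the seed words are hypothetical, and it assumes the constrained trainer accepts per-word topic restrictions of the kind described in the third algorithm below:

```python
# Hypothetical seed keywords per genome (step 202), from domain knowledge.
genome_seed_words = {
    "Action":    ["action", "feats"],
    "Adventure": ["adventurous"],
    "Western":   ["western", "cowboy"],
}

L = len(genome_seed_words)  # number of genomes (step 206)
T = L + 20                  # T > L: extra topics carry common/functional words

# Step 204: topic index i corresponds to genome i; topics L..T-1 are
# unconstrained background topics.
genome_to_topic = {g: i for i, g in enumerate(genome_seed_words)}

# Per-word topic restrictions: a seed word may only be assigned to its
# genome's topic during sampling (the sets C_n^(d) of the third algorithm).
allowed_topics = {
    word: {genome_to_topic[g]}
    for g, words in genome_seed_words.items()
    for word in words
}
```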

FIG. 3 shows an example of a trained model output according to one embodiment. At 302, various genre aspects are listed: Action, Adventure, . . . , and Western. Each genre is a topic and corresponds to a genome.

At 304, a distribution of words (e.g., terms) is listed for each genome. The distribution of words includes probabilities that are associated with each word. For example, for the Action genome, the words action: 0.23, feats: 0.16, fight: 0.15, violence: 0.09, chase: 0.09, struggle: 0.07, etc. are provided. The numbers after the words represent the contribution (e.g., probability) of the word to the genome. For the term “feats” in the genome Action, the term contributes a probability of 16% to the genome Action among the terms listed in the genome Action. Also, in one embodiment, the term action may have been specified for the genome Action in this case, as was described in 202 and 204 in FIG. 2, but the topic model may have determined the word “feats” as being part of the topic Action. The topic model uses word co-occurrence information determined from the corpus to determine the words in each topic and the probability distribution. The co-occurrence may be the concurrence/coincidence or the frequent occurrence of two terms from the corpus alongside each other in a certain order. Co-occurrence can be interpreted as an indicator of semantic proximity or an idiomatic expression. The topic model exploits co-occurrence to determine the terms for each topic.

Genome Generation

Once topic model trainer 102 determines the trained topic model, genome determiner 104 may generate the genome for the media programs. FIG. 4 depicts a simplified flowchart 400 of a method for calculating a joint probability distribution of media programs and the genome according to one embodiment. At 402, genome determiner 104 calculates a term frequency for each media program. For example, genome determiner 104 determines how many times each word occurs in the textual information for each media program. In one example, genome determiner 104 only determines the term frequency for specific words and not every word in the textual information.

At 404, genome determiner 104 scores the first L topics for each media program. As described above, the number of genomes that was specified is the number L. Thus, genome determiner 104 determines the first L topics corresponding to genomes that are determined for each media program. This narrows the number of topics to the desired number of genomes. The scoring is based on the term frequency and which topics the terms are associated with. For example, a term may occur five times for a video. If that term is found in a topic, then genome determiner 104 scores the topic based on the frequency of five occurrences and the probability for the term assigned by the trained model. For example, 5 occurrences of the word feats yield a 5*0.16=0.80 score for the term feats in the topic Action for the media program. In one example, the higher the term frequency, the higher the probability that the topic is associated with the media program.

At 406, genome determiner 104 normalizes the scores for each media program to determine a joint distribution of genomes and media programs. The normalization removes any popularity skew from the scores that may occur because a media program is popular. For example, a media program may be very popular and thus may have a higher term frequency. The normalization removes the popularity by considering the marginal distribution of media programs to be uniform. In other words, all media programs are treated equally with respect to the genomes. As described above, each topic is associated with a genome. Thus, the joint distribution of media programs and topics may be the probability that a genome belongs to a media program.
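
Steps 402-406 can be combined into one sketch building on the scoring fragment given earlier. This is illustrative only, and it assumes the uniform marginal over programs is implemented by giving each program's scores equal total mass:

```python
from collections import Counter

def genome_program_joint(programs, topic_term_probs, L):
    """Compute a genome/media-program joint distribution: term
    frequencies per program (402), scores for the first L topics (404),
    then normalization so each program carries equal total mass (406)."""
    genomes = list(topic_term_probs)[:L]  # first L topics are the genomes
    joint = {g: {} for g in genomes}
    n = len(programs)
    for name, text in programs.items():
        tf = Counter(text.lower().split())
        scores = {g: sum(tf[t] * p for t, p in topic_term_probs[g].items())
                  for g in genomes}
        total = sum(scores.values()) or 1.0
        for g in genomes:
            # Uniform marginal: each program contributes mass 1/n in total,
            # so popularity (sheer text volume) does not skew the joint.
            joint[g][name] = scores[g] / total / n
    return joint
```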

FIG. 5A shows an example of generating the genome according to one embodiment. A trained model 500 includes a topic in each row 502-1-502-T. Each topic includes terms 504-1-504-T. For example, the topic in row 502-1 includes terms 504-1. These are terms that the topic model determined were associated with this topic based on an analysis of the textual information for the media programs.

A chart 506 shows a term frequency for each media program. For example, each column 508-1-508-N is associated with a media program and includes the term frequencies 509-1-509-N for multiple terms in each respective media program. In one example, for media program #1, the term “feats” is found 5 times. Also, for media program #2, the term “feats” is found 8 times.

When applying the term frequency to the trained model, genome determiner 104 may multiply the frequency for a term by the probability for the term in the model. Although a multiplication is discussed, other operations may be performed to apply the term frequency to the probability. In one example, if media program #1 includes five instances of the word feats, then the probability of feats is weighted five times (5*0.16). Also, if media program #2 includes eight instances of the word feats, then the probability of feats is weighted eight times (8*0.16). This calculation is performed for each of the terms of media program #1 (and all other media programs).

A chart 512 shows a normalization of applying the term frequency to the trained model. As discussed above, the normalization may adjust the values to a common scale to account for irregularities, such as the popularity of a media program. Also, chart 512 includes the top L topics. This is the number of genomes that are desired. By relating genomes to topics and documents to media programs, chart 512 provides a joint distribution of genomes and media programs. FIG. 5B shows a chart 550 of an example of the joint distribution of genomes and shows according to one embodiment. Columns 552-1, 552-2, . . . , 552-N show different videos of Video #1, Video #2, . . . , and Video #N. Also, rows 554-1, 554-2, . . . , 554-N show different genomes of Action, Adventure, . . . , and Western. For each genome, the joint probability distribution for the genome and the media programs is shown. For example, in row 554-1, for the genome Action, the joint probability is 0.56 for Video #1, 0.03 for Video #2, and 0.26 for Video #N. In this case, the genome Action is more likely applicable to Video #1 than Video #2 or Video #N due to its higher score. The higher score may be due to Video #1 including more terms found in the topic Action, and/or a higher term frequency for the terms found in the topic Action. Also, Video #N may have a higher probability that the genome Adventure is applicable to it due to having a higher score (0.68) than the other videos. The joint distribution is listed for each genome and the media programs.

In one example, a media program may have the words action, adventurous, and energetic in its textual information. In this case, the joint distribution for the media program may list the genomes Action and Adventure with a high probability due to the words action, adventurous, and energetic being associated with the media program. However, because no words found in the Western genome occur in the media program's textual information, the Western genome may have a lower probability of being associated with the media program.

Example Algorithm

To train the topic model, particular embodiments may use two approaches: variational inference and Gibbs sampling. The algorithm described here uses Gibbs sampling, but variational inference or other approaches may be used. In Gibbs sampling, the updating rule is the key formulation. The probability of a topic (e.g., a topic index identifying the topic) for a current word (the $n$-th word in the $d$-th document) $z_n^{(d)}$, given all words $W$, topic indexes for all words but the current word $Z_{-d,n}$, and some hyper-parameters $h$, i.e., $P(z_n^{(d)}=t \mid W, Z_{-d,n}, h)$, should be updated during the iterative sampling.

The following is a first algorithm to determine the topic model:

$$P(z_n^{(d)}=t \mid W, Z_{-d,n}, \alpha, \beta) \propto P(w_n^{(d)}=v \mid z_n^{(d)}=t, W_{-d,n}, Z_{-d,n}, \beta)\, P(z_n^{(d)}=t \mid Z_{-d,n}, \alpha)$$

where the first item on the right is

${P\left( {{w_{n}^{(d)} = {{vz_{n}^{(d)}} = t}},W_{{- d},n},Z_{{- d},n},\beta} \right)} \propto \frac{N_{vt}^{{- d},n} + {\beta/V}}{N_{t}^{{- d},n} + \beta}$

and the second item is

${P\left( {{z_{n}^{(d)} = {tZ_{{- d},n}}},\alpha} \right)} \propto \frac{N_{td}^{{- d},n} + {\alpha/T}}{N_{d} - 1 + \alpha}$

where $N_{v|t}^{-d,n}$ is the number of words, excluding the current one, that are $v$ and belong to topic $t$; $N_t^{-d,n}$ is the number of words, excluding the current one, that belong to topic $t$; $N_{t|d}^{-d,n}$ is the number of words, excluding the current one, that are in document $d$ and belong to topic $t$; $N_d$ is the number of words in document $d$; $V$ is the number of unique words in the corpus; and $T$ is the number of topics.
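
For concreteness, a compact sketch of collapsed Gibbs sampling under this first algorithm follows. It is a standard implementation outline, not the code of any particular embodiment; the function name, signature, and defaults are hypothetical:

```python
import numpy as np

def gibbs_lda(docs, V, T, alpha=1.0, beta=1.0, iters=200, seed=0):
    """Collapsed Gibbs sampling for the first algorithm: resample each
    word's topic from P(z=t|...) proportional to
    (N_{v|t} + beta/V)/(N_t + beta) * (N_{t|d} + alpha/T)/(N_d - 1 + alpha).
    docs is a list of word-id lists with ids in [0, V)."""
    rng = np.random.default_rng(seed)
    n_vt = np.zeros((V, T))          # word-topic counts N_{v|t}
    n_t = np.zeros(T)                # topic counts N_t
    n_td = np.zeros((T, len(docs)))  # topic-document counts N_{t|d}
    z = [rng.integers(T, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):   # initialize counts from random z
        for v, t in zip(doc, z[d]):
            n_vt[v, t] += 1; n_t[t] += 1; n_td[t, d] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, v in enumerate(doc):
                t = z[d][n]          # remove current word from the counts
                n_vt[v, t] -= 1; n_t[t] -= 1; n_td[t, d] -= 1
                p = ((n_vt[v] + beta / V) / (n_t + beta)
                     * (n_td[:, d] + alpha / T) / (len(doc) - 1 + alpha))
                t = rng.choice(T, p=p / p.sum())
                z[d][n] = t          # add the word back under the new topic
                n_vt[v, t] += 1; n_t[t] += 1; n_td[t, d] += 1
    # Topic-term distribution estimated from the final counts.
    phi = (n_vt + beta / V) / (n_t + beta)
    return phi.T, z                  # phi[t] is topic t's word distribution
```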

The following second algorithm prevents small topics from being submerged by large topics. In this case, only the second item differs from the first algorithm:

${P\left( {{z_{n}^{(d)} = {tZ_{{- d},n}}},\alpha,\alpha^{\prime}} \right)} \propto \begin{matrix}\frac{N_{td}^{{- d},n} + {\alpha \frac{N_{t}^{{- d},n} + {\alpha^{\prime}/T}}{W - 1 + \alpha^{\prime}}}}{N_{d} - 1 + \alpha}\end{matrix}$

where $W$ is the number of words in the corpus. The second algorithm introduces the hyper-parameter $\alpha'$ to relax the model and prevent small topics from being submerged by large topics.

The following third algorithm allows words to be associated with topic indexes. In this case, only the first item differs from the first algorithm:

${P\left( {{w_{n}^{(d)} = {{vz_{n}^{(d)}} = t}},W_{{- d},n},Z_{{- d},n},\beta} \right)} \propto {\frac{N_{vt}^{{- d},n} + {\beta/V}}{N_{t}^{{- d},n} + \beta}{\delta \left( {t \in C_{n}^{(d)}} \right)}}$

where $\delta(t \in C_n^{(d)})$ restricts $z_n^{(d)}=t$ to a subset of values $C_n^{(d)}$, which is a set of topic indexes. In other words, the $n$-th word in the $d$-th document should only belong to topics in $C_n^{(d)}$. Thus, the third algorithm associates words with topic indexes.

The following fourth algorithm allows domain knowledge to be incorporated into the model, and relationships between words can also be specified. In this case, only the first item differs from the first algorithm:

${P\left( {{w_{n}^{(d)} = {{vz_{n}^{(d)}} = t}},W_{{- d},n},Z_{{- d},n},q_{1\text{:}T}} \right)} \propto \begin{matrix}{\prod\limits_{s \in {I_{t}{({\uparrow i})}}}\; \frac{\gamma_{t}^{C_{t}{({s \downarrow i})}} + n_{{- i},t}^{C_{t}{({s \downarrow i})}}}{\sum_{k \in {C_{t}{(s)}}}\left( {\gamma_{t}^{(k)} + n_{{- i},t}^{(k)}} \right)}}\end{matrix}$

where $i$ is the global index of the $n$-th word in the $d$-th document, and $q_{1:T}$ are Dirichlet trees modeled by $\gamma$ and $\beta$. $I_t(\uparrow i)$ denotes the subset of internal nodes in topic $t$'s Dirichlet tree that are ancestors of leaf $w_i$. $C_t(s \downarrow i)$ is the unique node that is $s$'s immediate child and an ancestor of $w_i$ (including $w_i$ itself). The fourth algorithm adopts a Dirichlet forest as the prior distribution of the parameter $\varphi$ (which is the word distribution of each topic) instead of a Dirichlet distribution. Thus, domain knowledge can be incorporated into the topic model. For example, particular embodiments can specify that ‘apple’ and ‘orange’ co-occur with a high probability in a same topic, and that ‘apple’ and ‘football’ do not co-occur in a same topic. Different rules of domain knowledge lead to different tree structures.

Particular embodiments combine the second, third, and fourth algorithms into one.

The first item is the combination of those of the third algorithm and the fourth algorithm:

${P\left( {{w_{n}^{(d)} = {{vz_{n}^{(d)}} = t}},W_{{- d},n},Z_{{- d},n},q_{1\text{:}T}} \right)} \propto {\prod\limits_{s \in {I_{t}{({\uparrow i})}}}{\frac{\gamma_{t}^{C_{t}{({s \downarrow i})}} + n_{{- i},t}^{C_{t}{({s \downarrow i})}}}{\sum_{k \in {C_{t}{(s)}}}\left( {\gamma_{t}^{(k)} + n_{{- i},t}^{(k)}} \right)}{\delta \left( {t \in C^{(i)}} \right)}}}$

and the second item is that of the second algorithm:

${P\left( {{z_{n}^{(d)} = {tZ_{{- d},n}}},\alpha,\alpha^{\prime}} \right)} \propto \frac{N_{td}^{{- d},n} + {\alpha \frac{N_{t}^{{- d},n} + {\alpha^{\prime}/T}}{W - 1 + \alpha^{\prime}}}}{N_{d} - 1 + \alpha}$

Thus, the updating rule of the topic model based on the first algorithm becomes:

${P\left( {{z_{n}^{(d)} = {tW}},Z_{{- d},n},h} \right)} \propto {{P\left( {{z_{n}^{(d)} = {tZ_{{- d},n}}},\alpha,\alpha^{\prime}} \right)}{P\left( {{w_{n}^{(d)} = {{vz_{n}^{(d)}} = t}},W_{{- d},n},Z_{{- d},n},q_{1\text{:}T}} \right)}} \propto {\begin{matrix}\frac{N_{td}^{{- d},n} + {\alpha \frac{N_{t}^{{- d},n} + {\alpha^{\prime}/T}}{W - 1 + \alpha^{\prime}}}}{N_{d} - 1 + \alpha}\end{matrix}\begin{matrix}{\prod\limits_{s \in {I_{t}{({\uparrow i})}}}\; \frac{\gamma_{t}^{C_{t}{({s \downarrow i})}} + n_{{- i},t}^{C_{t}{({s \downarrow i})}}}{\Sigma_{k \in {C_{t}{(s)}}}\left( {\gamma_{t}^{(k)} + n_{{- i},t}^{(k)}} \right.}}\end{matrix}\begin{matrix}{\delta \left( {t \in C^{(i)}} \right)}\end{matrix}}$

System Overview

Features and aspects as disclosed herein may be implemented in conjunction with a video streaming system 600 in communication with multiple client devices via one or more communication networks as shown in FIG. 6. Aspects of the video streaming system 600 are described merely to provide an example of an application for enabling distribution of content prepared according to the present disclosure. It should be appreciated that the present technology is not limited to streaming video applications, and may be adapted for other applications.

Video data may be obtained from one or more sources, for example, from a video source 610, for use as input to a video content server 602. The input video data may comprise raw or edited frame-based video data in any suitable digital format, for example, MPEG-1, MPEG-2, MPEG-4, VC-1, or another format. In an alternative, a video may be provided in a non-digital format and converted to digital format using a scanner and/or transcoder. The input video data may comprise video clips or programs of various types, for example, television episodes, motion pictures, and other content produced as primary content of interest to consumers.

The video streaming system 600 may include one or more computer servers or modules 602, 604, and/or 607 distributed over one or more computers. Each server 602, 604, 607 may include, or may be operatively coupled to, one or more data stores 609, for example databases, indexes, files, or other data structures. A video content server 602 may access a data store (not shown) of various video segments. The video content server 602 may serve the video segments as directed by a user interface controller communicating with a client device. As used herein, a video segment refers to a definite portion of frame-based video data, such as may be used in a streaming video session to view a television episode, motion picture, recorded live performance, or other video content.

In some embodiments, a video advertising server 604 may access a data store of relatively short videos (e.g., 10-second, 30-second, or 60-second video advertisements) configured as advertising for a particular advertiser or message. The advertising may be provided for an advertiser in exchange for payment of some kind, or may comprise a promotional message for the system 600, a public service message, or some other information. The video advertising server 604 may serve the video advertising segments as directed by a user interface controller (not shown).

The video streaming system 600 also may include system 100 for determining the genome for the media programs.

The video streaming system 600 may further include an integration and streaming component 607 that integrates video content and video advertising into a streaming video segment. For example, streaming component 607 may be a content server or streaming media server. A controller (not shown) may determine the selection or configuration of advertising in the streaming video based on any suitable algorithm or process. The video streaming system 600 may include other modules or units not depicted in FIG. 6, for example administrative servers, commerce servers, network infrastructure, advertising selection engines, and so forth.

The video streaming system 600 may connect to a data communication network 612. A data communication network 612 may comprise a local area network (LAN), a wide area network (WAN), for example, the Internet, a telephone network, a wireless cellular telecommunications network (WCS) 614, or some combination of these or similar networks.

One or more client devices may be in communication with the video streaming system 600, via the data communication network 612 and/or other network 614. Such client devices may include, for example, one or more laptop computers 622, desktop computers 620, “smart” mobile phones 627, notepad devices 624, network-enabled televisions 628, or combinations thereof, via a router 618 for a LAN, via a base station 616 for a wireless telephony network 614, or via some other connection. In operation, such client devices 620, 622, 624, 627, or 628 may send and receive data or instructions to the system 600, in response to user input received from user input devices or other input. In response, the system 600 may serve video segments and metadata from the data store 609 responsive to selection of interactive links to the client devices 620, 622, 624, 627, or 628 and customize the additional content based on parameters of the client devices, for example respective geographic locations of the client devices, or demographic information concerning respective users of the client devices. The devices 620, 622, 624, 627, or 628 may output interactive video content from the streaming video segment using a display screen, projector, or other video output device, and receive user input for interacting with the video content.

Distribution of audio-video data may be implemented from streaming component 607 to remote client devices over computer networks, telecommunications networks, and combinations of such networks, using various methods, for example streaming. In streaming, a content server streams audio-video data continuously to a media player component operating at least partly on the client device, which may play the audio-video data concurrently with receiving the streaming data from the server. Although streaming is discussed, other methods of delivery may be used. The media player component may initiate play of the video data immediately after receiving an initial portion of the data from the content provider. Traditional streaming techniques use a single provider delivering a stream of data to a set of end users. High bandwidths and processing power may be required to deliver a single stream to a large audience, and the required bandwidth of the provider may increase as the number of end users increases.

Streaming media can be delivered on-demand or live. Streaming enables immediate playback at any point within the file. End-users may skip through the media file to start playback or change playback to any point in the media file. Hence, the end-user does not need to wait for the file to progressively download. Typically, streaming media is delivered from a few dedicated servers having high bandwidth capabilities via a specialized device that accepts requests for video files, and with information about the format, bandwidth, and structure of those files, delivers just the amount of data necessary to play the video, at the rate needed to play it. Streaming media servers may also account for the transmission bandwidth and capabilities of the media player on the destination client. Unlike the web server, the streaming component 607 may communicate with the client device using control messages and data messages to adjust to changing network conditions as the video is played. These control messages can include commands for enabling control functions such as fast forward, fast reverse, pausing, or seeking to a particular part of the file at the client.

Since streaming component 607 transmits video data only as needed and at the rate that is needed, precise control over the number of streams served can be maintained. The viewer will not be able to view high data rate videos over a lower data rate transmission medium. However, streaming media servers (1) provide users random access to the video file, (2) allow monitoring of who is viewing what video programs and how long they are watched, (3) use transmission bandwidth more efficiently, since only the amount of data required to support the viewing experience is transmitted, and (4) do not store the video file in the viewer's computer; it is discarded by the media player, thus allowing more control over the content.

Streaming component 607 may use HTTP and TCP to deliver video streams, but generally uses RTSP (real time streaming protocol) and UDP (user datagram protocol). These protocols permit control messages and save bandwidth by reducing overhead. Unlike TCP, when data is dropped during transmission, UDP does not transmit resend requests. Instead, the server continues to send data. Streaming component 607 can also deliver live webcasts and can multicast, which allows more than one client to tune into a single stream, thus saving bandwidth. Streaming media players may not rely on buffering to provide random access to any point in the media program. Instead, this is accomplished through the use of control messages transmitted from the media player to the streaming media server. Another protocol used for streaming is hypertext transfer protocol (HTTP) live streaming (HLS). The HLS protocol delivers video over HTTP via a playlist of small segments that are made available in a variety of bitrates, typically from one or more content delivery networks (CDNs). This allows a media player to switch both bitrates and content sources on a segment-by-segment basis. The switching helps compensate for network bandwidth variances and also infrastructure failures that may occur during playback of the video.

The delivery of video content by streaming may be accomplished under a variety of models. In one model, the user pays for the viewing of each video program, for example, using a pay-per-view service. In another model widely adopted by broadcast television shortly after its inception, sponsors pay for the presentation of the media program in exchange for the right to present advertisements during or adjacent to the presentation of the program. In some models, advertisements are inserted at predetermined times in a video program, which times may be referred to as “ad slots” or “ad breaks.” With streaming video, the media player may be configured so that the client device cannot play the video without also playing predetermined advertisements during the designated ad slots.

Output from a media player on the client device may occupy only a portion of the total screen area available on a client device, particularly when bandwidth limitations restrict the resolution of streaming video. Although media players often include a “full screen” viewing option, many users prefer to watch video in a display area smaller than full screen, depending on the available video resolution. Accordingly, the video may appear in a relatively small area or window of an available display area, leaving unused areas. A video provider may occupy the unused area with other content or interface objects, including additional advertising, such as, for example, banner ads. Banner ads or similar additional content may be provided with links to an additional web site or page, so that when a user “clicks on” or otherwise selects the banner ad, the additional web site or page opens in a new window.

Referring to FIG. 7, a diagrammatic view of an apparatus 700 for viewing video content and advertisements is illustrated. In selected embodiments, the apparatus 700 may include a processor (CPU) 702 operatively coupled to a processor memory 704, which holds binary-coded functional modules for execution by the processor 702. Such functional modules may include an operating system 706 for handling system functions such as input/output and memory access, a browser 708 to display web pages, and a media player 710 for playing video. The memory 704 may hold additional modules not shown in FIG. 7, for example modules for performing other operations described elsewhere herein.

A bus 714 or other communication component may support communication of information within the apparatus 700. The processor 702 may be a specialized or dedicated microprocessor configured to perform particular tasks in accordance with the features and aspects disclosed herein by executing machine-readable software code defining the particular tasks. Processor memory 704 (e.g., random access memory (RAM) or other dynamic storage device) may be connected to the bus 714 or directly to the processor 702, and store information and instructions to be executed by a processor 702. The memory 704 may also store temporary variables or other intermediate information during execution of such instructions.

A computer-readable medium in a storage device 724 may be connected to the bus 714 and store static information and instructions for the processor 702; for example, the storage device (CRM) 724 may store the modules 706, 708, 710 and 712 when the apparatus 700 is powered off, from which the modules may be loaded into the processor memory 704 when the apparatus 700 is powered up. The storage device 724 may include a non-transitory computer-readable storage medium holding information, instructions, or some combination thereof, for example instructions that when executed by the processor 702, cause the apparatus 700 to be configured to perform one or more operations of a method as described herein.

A communication interface 716 may also be connected to the bus 714. The communication interface 716 may provide or support two-way data communication between the apparatus 700 and one or more external devices, e.g., the streaming system 600, optionally via a router/modem 726 and a wired or wireless connection 725. In the alternative, or in addition, the apparatus 700 may include a transceiver 718 connected to an antenna 728, through which the apparatus 700 may communicate wirelessly with a base station for a wireless communication system or with the router/modem 726. In the alternative, the apparatus 700 may communicate with a video streaming system 600 via a local area network, virtual private network, or other network. In another alternative, the apparatus 700 may be incorporated as a module or component of the system 600 and communicate with other components via the bus 714 or by some other modality.

The apparatus 700 may be connected (e.g., via the bus 714 and graphics processing unit 720) to a display unit 728. A display 728 may include any suitable configuration for displaying information to an operator of the apparatus 700. For example, a display 728 may include or utilize a liquid crystal display (LCD), touchscreen LCD (e.g., capacitive display), light emitting diode (LED) display, projector, or other display device to present information to a user of the apparatus 700 in a visual display.

One or more input devices 730 (e.g., an alphanumeric keyboard, microphone, keypad, remote controller, game controller, camera, or camera array) may be connected to the bus 714 via a user input port 722 to communicate information and commands to the apparatus 700. In selected embodiments, an input device 730 may provide or support control over the positioning of a cursor. Such a cursor control device, also called a pointing device, may be configured as a mouse, a trackball, a track pad, touch screen, cursor direction keys, or other device for receiving or tracking physical movement and translating the movement into electrical signals indicating cursor movement. The cursor control device may be incorporated into the display unit 728, for example using a touch-sensitive screen. A cursor control device may communicate direction information and command selections to the processor 702 and control cursor movement on the display 728. A cursor control device may have two or more degrees of freedom, for example allowing the device to specify cursor positions in a plane or three-dimensional space.

Particular embodiments may be implemented in a non-transitory computer-readable storage medium for use by or in connection with the instruction execution system, apparatus, system, or machine. The computer-readable storage medium contains instructions for controlling a computer system to perform a method described by particular embodiments. The computer system may include one or more computing devices. The instructions, when executed by one or more computer processors, may be configured to perform that which is described in particular embodiments.

As used in the description herein and throughout the claims that follow, “a”, “an”, and “the” includes plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.

The above description illustrates various embodiments along with examples of how aspects of particular embodiments may be implemented. The above examples and embodiments should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of particular embodiments as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations and equivalents may be employed without departing from the scope hereof as defined by the claims.

What is claimed is:
 1. A method comprising: incorporating, by a computing device, a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model; incorporating, by the computing device, a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model; training, by the computing device, the model with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs; and scoring, by the computing device, terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.

 2. The method of claim 1, further comprising incorporating, by the computing device, a parameter that prevents topics from being submerged by other topics that are larger via a third item in the model.

 3. The method of claim 1, wherein incorporating, by the computing device, the first set of words that belong to the topic in the set of topics comprises restricting words to a subset of the set of topics via the first item.

 4. The method of claim 1, wherein training the model comprises generating the probability distribution when the first set of words are in the topic.

 5. The method of claim 1, wherein training the model comprises generating, by the computing device, the probability distribution in view of the relationship between the second set of words that are associated with the topics in the set of topics.

 6. The method of claim 1, wherein the relationship specifies two words that co-occur in the topic or do not co-occur in the topic.

 7. The method of claim 1, wherein scoring comprises: determining, by the computing device, a term frequency for terms found in the textual information for each media program; and scoring, by the computing device, the topics for each media program based on the probability distribution of terms for the set of topics in the trained model and the corresponding term frequency.

 8. The method of claim 1, further comprising normalizing, by the computing device, the scoring of the topics.

 9. The method of claim 1, further comprising selecting, by the computing device, a number of highest scored topics for each media program as the genomes for each media program.

 10. The method of claim 1, wherein a joint distribution function of a plurality of media programs and the genomes is determined.

 11. The method of claim 10, wherein the joint distribution function comprises a joint probability the genome applies to the media program among the plurality of media programs.

 12. The method of claim 1, further comprising: defining, by the computing device, information for the set of genomes, the set of genomes describing characteristics of media programs; and defining, by the computing device, which genomes in the set of genomes correspond to which topics in the set of topics.

 13. The method of claim 1, wherein training the model comprises: inputting, by the computing device, the textual information for the plurality of media programs and the information for the set of genomes into the model.

 14. The method of claim 1, wherein training the model comprises: determining, by the computing device, words other than the first set of words to associate with the set of topics based on term co-occurrence in the textual information.

 15. A non-transitory computer-readable storage medium containing instructions, that when executed, control a computer system to be configured for: incorporating a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model; incorporating a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model; training the model with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs; and scoring terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.

 16. The non-transitory computer-readable storage medium of claim 15, further configured for incorporating a parameter that prevents topics from being submerged by other topics that are larger via a third item in the model.

 17. The non-transitory computer-readable storage medium of claim 16, wherein incorporating the first set of words that belong to the topic in the set of topics comprises restricting words to a subset of the set of topics via the first item.

 18. The non-transitory computer-readable storage medium of claim 17, wherein training the model comprises generating the probability distribution when the first set of words are in the topic.

 19. The non-transitory computer-readable storage medium of claim 15, wherein training the model comprises generating, by the computing device, the probability distribution in view of the relationship between the second set of words that are associated with the topics in the set of topics.

 20. An apparatus comprising: one or more computer processors; and a non-transitory computer-readable storage medium comprising instructions, that when executed, control the one or more computer processors to be configured for: incorporating a first set of words that belong to a topic in a set of topics that correspond to a set of genomes in a model, the words being incorporated in the model via a first item in the model; incorporating a relationship between a second set of words that are associated with topics in the set of topics via a second item in the model; training the model with respect to the first item and the second item to determine a probability distribution of terms for the set of topics based on analyzing textual information for a plurality of media programs; and scoring terms for each of the plurality of media programs based on the trained model to rank topics that correspond to genomes, the genomes describing characteristics for each media program.